Parallel Neural Network Features for Improved Tandem Acoustic Modeling
نویسندگان
چکیده
The combination of acoustic models or features is a standard approach to exploit various knowledge sources. This paper investigates the concatenation of different bottleneck (BN) neural network (NN) outputs for tandem acoustic modeling. Thus, combination of NN features is performed via Gaussian mixture models (GMM). Complementarity between the NN feature representations is attained by using various network topologies: LSTM recurrent, feed-forward, and hierarchical, as well as different non-linearities: hyperbolic tangent, sigmoid, and rectified linear units. Speech recognition experiments are carried out on various tasks: telephone conversations, Skype calls, as well as broadcast news and conversations. Results indicate that LSTM based tandem approach is still competitive, and such tandem model can challenge comparable hybrid systems. The traditional steps of tandem modeling, speaker adaptive and sequence discriminative GMM training, improve the tandem results further. Furthermore, these “old-fashioned” steps remain applicable after the concatenation of multiple neural network feature streams. Exploiting the parallel processing of input feature streams, it is shown that 2-5% relative improvement could be achieved over the single best BN feature set. Finally, we also report results after neural network based language model rescoring and examine the system combination possibilities using such complex tandem models.
منابع مشابه
شبکه عصبی پیچشی با پنجرههای قابل تطبیق برای بازشناسی گفتار
Although, speech recognition systems are widely used and their accuracies are continuously increased, there is a considerable performance gap between their accuracies and human recognition ability. This is partially due to high speaker variations in speech signal. Deep neural networks are among the best tools for acoustic modeling. Recently, using hybrid deep neural network and hidden Markov mo...
متن کاملA Hybrid Neural Network Approach for Kinematic Modeling of a Novel 6-UPS Parallel Human-Like Mastication Robot
Introduction we aimed to introduce a 6-universal-prismatic-spherical (UPS) parallel mechanism for the human jaw motion and theoretically evaluate its kinematic problem. We proposed a strategy to provide a fast and accurate solution to the kinematic problem. The proposed strategy could accelerate the process of solution-finding for the direct kinematic problem by reducing the number of required ...
متن کاملPersian Phone Recognition Using Acoustic Landmarks and Neural Network-based variability compensation methods
Speech recognition is a subfield of artificial intelligence that develops technologies to convert speech utterance into transcription. So far, various methods such as hidden Markov models and artificial neural networks have been used to develop speech recognition systems. In most of these systems, the speech signal frames are processed uniformly, while the information is not evenly distributed ...
متن کاملJoint Modeling of Text and Acoustic-Prosodic Cues for Neural Parsing
In conversational speech, the acoustic signal provides cues that help listeners disambiguate difficult parses. For automatically parsing a spoken utterance, we introduce a model that integrates transcribed text and acoustic-prosodic features using a convolutional neural network over energy and pitch trajectories coupled with an attention-based recurrent neural network that accepts text and word...
متن کاملInvestigations into tandem acoustic modeling for the Aurora task
In tandem acoustic modeling, signal features are first processed by a discriminantly-trained neural network, then the outputs of this network are treated as the feature inputs to a conventional distribution-modeling Gaussian-mixture model (GMM) speech recognizer. This arrangement achieves relative error rate reductions of 30% or more on the Aurora task, as well as supporting feature stream comb...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2017